Reliability is crucial to safety. Redundancy of important system components greatly
enhances reliability and hence safety. Field-Programmable Gate Arrays (FPGAs) are useful
for monitoring systems and handling the logic necessary to keep them running with minimal
interruption when individual components fail. A complete microcontroller watchdog with
logic for failure handling can be implemented in a hardware description language (HDL).
HDL-based designs are vendor-independent and can be used on many FPGAs with low
overhead.


Nomenclature

ADC     = Analog-to-Digital Converter
DAC     = Digital-to-Analog Converter
DMR     = Dual Modular Redundant
FPGA    = Field-Programmable Gate Array
HDL     = Hardware Description Language
IDE     = Integrated Development Environment
IEEE    = Institute of Electrical and Electronics Engineers
I/O     = Input/Output
IP Core = Intellectual Property Core. A functional black-box unit provided for use in an FPGA design as
          licensed intellectual property.
K-Map   = Karnaugh Map
LAS     = Launch Abort System
LFSR    = Linear Feedback Shift Register
VHDL    = VHSIC (Very High Speed Integrated Circuits) Hardware Description Language
POR     = Power-On Reset
WDT     = Watchdog Timer


I. Introduction

Safety is the cornerstone of the National Aeronautics and Space Administration's (NASA) core values.¹
Extraordinary accomplishments often come with extraordinary risks. One way to minimize risk and account for
the unexpected is through system redundancy. By duplicating (or even triplicating) the most critical elements of a
system, overall reliability is enhanced. The trade-off for greater reliability through redundancy is additional
overhead in terms of total cost, weight, system complexity, or other factors. For matters involving safety, however,
reliability should, to the greatest extent possible, not be compromised.

The basic implementation of redundancy is a primary and alternate pair. The alternate takes control when the
primary fails. Redundancy can also exist in systems such as a bridge with many suspension cables, where the failure
of individual cables steadily degrades the reliability of the bridge. In the bridge example, redundancy is inherent.
With a primary and alternate pair, some form of active monitoring must often be present to detect the failure of the
primary and bring the alternate online. This process of monitoring and switching to ensure system reliability is
known as voting logic.

Implementing a system watchdog with a Field-Programmable Gate Array (FPGA) is an example of voting logic.
FPGAs allow custom logic circuits to be designed and programmed as de facto hardware. An FPGA's logic can be
simulated quickly, its processes can run in parallel, and it can be easily reconfigured if problems are found or
updated later with improvements. The possible failure of the voting logic itself, however implemented, is an
additional consideration. For example, sensitivity to radiation at the gate level is a concern. Therefore, selection of
appropriately radiation-tolerant devices may be critical to the reliability of such a voting logic application.

The specific application discussed herein is for a Dual Modular Redundant (DMR) microcontroller pair. In this
DMR system, the two microcontrollers are fed the same inputs and operate in parallel. The output of one processor
actively controls the end-item devices (valves, thrust vector controllers, solid-state switches, etc.) while the
output of the other (standby) controller is inhibited. When the primary fails, the standby takes over the desired
functions with minimal interruption. Each microcontroller generates a heartbeat signal that is monitored by an
FPGA. A disruption in the heartbeat of the initially active microcontroller will cause the FPGA to switch control to
the alternate microcontroller.

Although the system considered is DMR, the techniques can be extended to several levels of redundancy. If the 
probability of failure for each identical element is considered independent, then the probability of failure for the 
system decreases dramatically. Mathematically, the probability of two independent events A and B both happening is 

$$ P(A \text{ and } B) = P(A \cap B) = P(A)\,P(B) \tag{1} $$

If the probability of such failures were exactly equal for the identical elements A, then the probability of total
failure F would decrease exponentially with the number of redundant layers x, since P(A) is less than one:

$$ P(F) = P(A)^{x} \tag{2} $$

These conditions are ideal, however, and never fully realizable: the conditions that lead to the failure of
one element may also influence the failure of another identical element. Still, if the probability of failure is
reasonably low to begin with, only a few layers of redundancy are needed to greatly reduce the probability of total
failure. Accordingly, the overhead associated with too much redundancy has diminishing returns if the elements have
the same vulnerabilities and will be subjected to the same stresses.
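
As a rough illustration of Eq. (2), assume a purely hypothetical per-element failure probability of P(A) = 10^-3 (a value chosen for arithmetic convenience, not taken from any measured device) and fully independent failures:

```latex
\begin{align*}
x = 1 &: \quad P(F) = 10^{-3} \\
x = 2 &: \quad P(F) = \left(10^{-3}\right)^{2} = 10^{-6} \\
x = 3 &: \quad P(F) = \left(10^{-3}\right)^{3} = 10^{-9}
\end{align*}
```

Under these idealized assumptions each added layer buys three orders of magnitude. Any failure mode shared by the elements would reduce that benefit, which is why a few layers are usually enough.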

II. Watchdog Realization

A watchdog timer (WDT) is a timer of fixed or specified duration that must be renewed by the system being 
watched to avoid timing out. If the WDT expires, it is a secondary indication of some problem with the system 
under observation. Many modern microcontrollers and other embedded systems come with a WDT already 
integrated. The typical corrective action when the WDT expires is to reset the microcontroller. The timeout
is often limited to a few fixed durations. While this may be a sensible solution for remote sensors that periodically
measure something mundane, the results could be disastrous for some critical component such as the processor
controlling thrust vectoring for a Launch Abort System (LAS).

For critical applications, monitoring the system with an external watchdog has several advantages. The external 
monitor may be designed to a certain set of specifications, such that it is not limited to the settings of the on-board 
WDT. The external monitor may be able to take corrective action sooner than the on-board WDT would reset the device. An external
monitor will detect a complete failure of an element where the integrated WDT also fails. Most importantly, an 
external monitor can multiplex a functioning element to the output if an active element fails, thereby minimizing 
interruption in system operation. 
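
The multiplexing step mentioned above can be sketched in VHDL as a simple two-input selector. This is only an illustrative sketch: the entity name `OutputMux`, the `Width` generic, and all port names are assumptions introduced here and do not come from the design described in this report.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical 2-to-1 output multiplexer (all names are illustrative).
entity OutputMux is
    generic (
        Width : positive := 8  -- assumed width of each controller's command bus
    );
    port (
        Mcu1Out : in  std_logic_vector(Width - 1 downto 0);  -- commands from controller 1
        Mcu2Out : in  std_logic_vector(Width - 1 downto 0);  -- commands from controller 2
        Sel     : in  std_logic;                             -- '0' = controller 1, '1' = controller 2
        DevOut  : out std_logic_vector(Width - 1 downto 0)   -- drives the end-item devices
    );
end OutputMux;

architecture Behaviour of OutputMux is
begin
    -- Any value of Sel other than '1' (including 'U', 'X', or 'Z')
    -- falls back to controller 1, so an output is always driven.
    DevOut <= Mcu2Out when Sel = '1' else Mcu1Out;
end Behaviour;
```

Because the selection is purely combinational, the changeover delay in such a sketch is set by the voting logic that drives `Sel`, not by the multiplexer itself.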

An FPGA is well-suited to perform the external monitoring and voting logic, since it is highly customizable, can
perform a wide variety of tasks in parallel (perhaps in addition to the voting logic), and often offers a large number of
input/output (I/O) pins. These features make an FPGA a robust and modular addition with an acceptably small
footprint for many system designs. It may well be the case that an FPGA is already being used in a design for some 
other purpose, and the voting logic functionality can be added as a convenience. 

The caveat is that FPGAs are usually digital devices only, meaning that they are best suited for cases where the
heartbeat or other parameter of a system being monitored is a digital signal. There are mixed-signal FPGAs capable
of analog-to-digital conversion (ADC), digital-to-analog conversion (DAC), and other functions commonly
integrated with microcontrollers, while some FPGAs themselves incorporate a complete microcontroller. Such
devices may allow for a combination of traditional serial programming in a language like C and parallel
programming in a hardware description language (HDL). The implementation considered here is not for a
mixed-signal device and assumes a digital heartbeat signal from the system being observed.

A. Watchdog Structure

One consideration in designing a watchdog is what constitutes a failure based upon the heartbeat signal received.
The simplest watchdog will renew its timer every time a pulse (also called a "kick") is received as long as the timer
has not expired. That may mean that a heartbeat signal which suddenly increases in frequency, one that slows and
just barely pulses before the watchdog expires, or one that just remains at a high logic level will all satisfy the
watchdog. This simple watchdog is just a window of time, where any pulse that does not exceed the allowed time limit
is acceptable. This may work for many applications, but a much stricter watchdog is achievable.

One benefit of considering more than just the rising pulse of a heartbeat is faster detection of potential failure.
By considering the heartbeat signal in more detail, an inappropriate rising edge may be an earlier indicator of failure
that allows a faster changeover to a still-functioning microcontroller. There are a few other indicators to check, and
since the FPGA can operate these checks in parallel, there is not much overhead to using them other than a relatively
small number of FPGA gates.

If the microcontroller outputs a digital heartbeat signal with a known duty cycle of 50% (the watchdog may be
adjusted for any duty cycle, but 50% is considered here), the watchdog can use the expected rise and fall times to
detect a failure, sometimes in less than one cycle of the heartbeat. The watchdog detailed here will test for the
following conditions: signal is stuck low or does not rise fast enough, signal is stuck high or does not fall fast enough,
signal rises outside an acceptable window of time, or signal is completely lost (or tri-stated). The signal may be lost
due to a dramatic occurrence such as the device suddenly catching fire, or because the connection is weak. Either way,
the FPGA controls the multiplexing of which microcontroller is active, so a lost signal is always treated as a failure
in this design.

B. Counters 

The building block for realizing the aforementioned watchdog structure is a counter. For maximum flexibility
and utility, the counter is described in Very High Speed Integrated Circuits Hardware Description Language
(VHDL). Each FPGA vendor provides its own Integrated Development Environment (IDE). These IDEs usually
include a method to design the logic circuits on the FPGA using schematic capture, which is a visual representation
of logic gates and boxes. A basic block such as a counter may have different characteristics from one IDE to
another, whereas VHDL can be easily imported and modified for use with any FPGA. The counters described here will
also include a few more features than are commonly provided with the schematic capture blocks, which make them more
adaptable to different heartbeat signals.

A basic counter takes in a clock signal and increments or decrements an internal register with each specified 
clock event, either rising or falling edge. The counter may incorporate a reset signal input that resets the count to the 
original value and a terminal count output that is asserted if the counter reaches its final value without being reset. 
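
The basic counter described above might look like the following minimal sketch. It is deliberately simpler than the configurable counter listed later in this report, and the entity name `BasicCounter`, the `MaxCount` generic, and the active-high reset polarity are assumptions made for illustration.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical minimal up-counter (names and polarities are illustrative).
entity BasicCounter is
    generic (
        MaxCount : positive := 5  -- final value of the count
    );
    port (
        Clock : in  std_logic;  -- counts on the rising edge
        Reset : in  std_logic;  -- asynchronous, active-high reset
        TCnt  : out std_logic   -- terminal count, high while the count equals MaxCount
    );
end BasicCounter;

architecture Behaviour of BasicCounter is
    signal Count : integer range 0 to MaxCount := 0;
begin
    Counting : process (Reset, Clock)
    begin
        if Reset = '1' then
            Count <= 0;                 -- return to the original value
        elsif rising_edge(Clock) then
            if Count = MaxCount then
                Count <= 0;             -- wrap around after the final value
            else
                Count <= Count + 1;
            end if;
        end if;
    end process;

    -- Asserted only if the counter reaches its final value without being reset.
    TCnt <= '1' when Count = MaxCount else '0';
end Behaviour;
```

Used as a timeout, the heartbeat would drive `Reset`, so `TCnt` goes high only when the heartbeat fails to arrive within `MaxCount` clock cycles.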

The first type of counter used in the watchdog is intended to create an acceptable window of time for a rising
edge of the heartbeat signal. This is accomplished with an active-low output. The counter is reset with each rising
heartbeat and is expected to reach terminal count before the next rising edge occurs, which should be during the
finite terminal count window.

A D flip-flop (seen as RisingEdgeTrigger in Fig. 3) is employed to detect a bad rising edge. The terminal
count of the counter is fed to the D input of the flip-flop, and the heartbeat signal is fed to the Clk input. A rising
edge of the heartbeat transfers the D input to the Q output of the flip-flop. If the rising edge occurs during an
active-low terminal count, the output is low and there is no error. If the rising edge occurs outside the terminal count
window, the output is high and there is an error. Note that although the rising edge resets the counter (and the
terminal count goes high), the input of the flip-flop should be low at the time of the rising edge. The events happen
in parallel, and there is always some propagation delay in FPGAs before gate states can make a logical transition.
Accordingly, the output of the counter will go high when reset only after the flip-flop acts on its original input.
Figure 1 illustrates the action of the acceptable window counter and watchdog in simulation.





Figure 1. Counter for generating acceptable rising edge window. The first two heartbeats reset the counter
while terminal count (D) is asserted low. The third rising edge is late, the full terminal count window is visible, and
the error is detected. The image is stretched for improved visibility of the waveforms.




Spring 2013 Session 




NASA USRP - Internship Final Report 


The second type of counter is a simple timeout where the terminal count itself is the error indicator. If the
counter is not reset in time, the watchdog is not satisfied. To detect errors faster, two counters are employed as
timeouts. One is reset by a rising edge and the other by a falling edge. Note that it is also possible to implement the
edge window counter for a falling edge instead of a rising edge, or for both, for the potential to detect an error even
faster. Figure 2 illustrates the action of the timeout counter and watchdog in simulation for a rising edge reset. The
action is the opposite for a falling edge reset.



Figure 2. Counter that times out without a rising edge. The first two heartbeats reset the counter normally. The
third rising edge is late and the error is detected. The image is stretched for improved visibility of the waveforms.


A schematic visualization of the entire watchdog is pictured in Fig. 3. The actual counters and watchdog are
written in VHDL, but the schematic capture here provides a block diagram view.



Figure 3. Schematic capture view of the watchdog with counters. A microcontroller failure is inferred from an
error indicator on any of the three counters watching the signal.

Using VHDL offers some advantages over the schematic capture. A major benefit is customization through the
use of "generics." Generics are used as numbers in a VHDL entity and have a default constant value, but a new
value may be passed in when an entity is instantiated in a higher-level design. This allows a single entity,
such as a counter, to be used in many different configurations. For instance, notice that the counter in Fig. 2
increments on a rising edge and has a maximum count of five. The counter is reset at the very last
possible moment (the simulation is pre-synthesis and includes no timing), so there is a possibility that when timing
is involved this will trigger many false errors. The count may easily be increased to six to account for
this possibility.
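
As a sketch of how such an override might look, the fragment below instantiates the report's CustomCounter entity with its count raised to six. The wrapper entity `TimeoutExample` and its port names are hypothetical, and the sketch assumes CustomCounter has been compiled into the `work` library.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical wrapper used only to show a generic override.
entity TimeoutExample is
    port (
        Clock     : in  std_logic;  -- system clock
        HeartBeat : in  std_logic;  -- pulse signal being watched
        TimedOut  : out std_logic   -- high when the pulse stays low too long
    );
end TimeoutExample;

architecture Behaviour of TimeoutExample is
begin
    -- Direct entity instantiation; assumes CustomCounter is in library "work".
    RisingTimeout : entity work.CustomCounter
        generic map (
            Count    => 6,   -- raised from the default of 5 to add timing margin
            OutputHL => '0'  -- request an active-high terminal count
            -- Generics not listed here keep the defaults declared in the entity.
        )
        port map (
            Clock => Clock,
            Reset => HeartBeat,
            TCnt  => TimedOut
        );
end Behaviour;
```

Only the generics that differ from the defaults need to appear in the map, which is what makes one counter description reusable across all three watchdog checks.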

An example implementation of this highly configurable counter is given below.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.NUMERIC_STD.ALL;

entity CustomCounter is
    generic (
        --! Adjust the duration of counter here. Default is for a system clock 10x the pulse signal.
        Count : integer := 5;               --! Count

        --! Controls whether the edge timeout is for a high or low signal.
        --! '0' will create a falling edge timeout (times out if pulse stays high)
        --! '1' will create a rising edge timeout (times out if pulse stays low) [default]
        EdgeType : std_logic := '1';        --! Specifies type of timeout

        --! Determines whether the counter acts on the falling or rising [default] edge of its driving clock.
        --! '0' selects the counter to act on the clock's falling edge
        --! '1' selects the counter to act on the clock's rising edge
        CounterDrive : std_logic := '1';    --! Specifies driving edge for counter clock

        --! Determines whether output is active high or active low. Active low is used to create a
        --! window for acceptable clock events.
        --! '0' will be active high
        --! '1' will be active low [default]
        OutputHL : std_logic := '1';        --! Specifies active High/Low output

        --! Adjust the duration (number of clock cycles) for which the counter's terminal count signal
        --! is asserted. Default is one clock cycle; increase for the rising edge window to have more
        --! tolerance and account for a pulse signal not synchronized with the system clock.
        --! Value cannot exceed "Count" since it is reset at the end of the count.
        TCntDur : integer := 0              --! Duration of TCnt in clock cycles
    );
    port (
        --! Inputs
        Clock : in std_logic;               --! Clock
        Reset : in std_logic;               --! Reset
        --! Outputs
        TCnt  : out std_logic               --! Terminal Count (polarity set by OutputHL)
    );
end CustomCounter;

architecture Behaviour of CustomCounter is

    --! The following function is a modification of the default rising_edge function provided in the
    --! Institute of Electrical and Electronics Engineers (IEEE) standard library.
    --! This function checks for a legitimate edge transition by checking the state previous to the clock event.
    --! Something like clock'event and clock = '1', while commonly used, does not account for the fact
    --! that the previous state may be other logic states, such as U or Z.
    FUNCTION action_edge (SIGNAL s : std_ulogic) RETURN BOOLEAN IS
    BEGIN
        RETURN (s'EVENT AND (To_X01(s) = CounterDrive) AND
                (To_X01(s'LAST_VALUE) = not CounterDrive));
    END;

    -- Signal Declarations
    -- InternalCount is given an explicit start value so that simulation does not
    -- begin counting from the most negative integer.
    signal InternalCount : integer := 0;            --! Maintains internal count
    signal AssertTime    : integer := TCntDur;      --! Maintains TCnt Duration
    signal ResetCount    : std_logic;               --! Indicates terminal count
    signal TCntInternal  : std_logic := OutputHL;   --! Initializes output to desired value

begin

    ResetCount <= '1' when InternalCount = Count else '0';

    -- Processes

    --! Counting process that resets the internal count on asynchronous reset and loops after reaching final count.
    Counting : process (Reset, Clock)
    begin
        if (Reset = EdgeType) then
            InternalCount <= 0;
        elsif action_edge(Clock) then
            if ResetCount = '1' then
                InternalCount <= 0;
            else
                InternalCount <= InternalCount + 1;
            end if;
        end if;
    end process;

    --! Will assert the terminal count for the specified number of clock cycles.
    Enabling : process (Reset, Clock)
    begin
        if (Reset = EdgeType) then
            TCntInternal <= OutputHL;
            AssertTime <= TCntDur;
        elsif action_edge(Clock) then
            if ResetCount = '1' then
                TCntInternal <= not OutputHL;
                AssertTime <= 0;
            else
                if AssertTime = TCntDur then
                    TCntInternal <= OutputHL;
                else
                    AssertTime <= AssertTime + 1;
                    TCntInternal <= not OutputHL;
                end if;
            end if;
        end if;
    end process;

    TCnt <= TCntInternal;

-- End Architecture
end Behaviour;

To put these counters to use as a watchdog, as visualized in Fig. 3, they must be instantiated and some of the
generics set. Below is the VHDL watchdog using the custom counter.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.NUMERIC_STD.ALL;

--! @details
--! The watchdog incorporates both a rising edge and falling edge timeout as well as
--! an acceptable window of tolerance for a rising edge. The +- tolerance of this window
--! is adjustable here by altering the counters or by changing the clock frequency.
entity Watchdog is
    port (
        --! Inputs
        HeartBeat  : in STD_LOGIC;  --! Heartbeat pulse; acts as the reset input of the counters.
        Clock      : in STD_LOGIC;  --! Clock.
        Initialize : in STD_LOGIC;  --! Initialize to erase start-up error

        --! Outputs
        --! Each output has a different meaning. In this project, the 'Error' output will remain
        --! asserted forever if a failure event has ever occurred. The 'Status' indicator is asserted
        --! (asserted here is a logic '1' by default) whenever any kind of failure is detected, but is
        --! not permanently asserted like the 'Error' indicator. If the pulse signal eventually "feeds
        --! the watchdog" back to a state of normalcy, the 'Status' output will return to unasserted.
        --! The remaining outputs indicate the particular type of failure, mostly for debugging and
        --! possibly for detailed failure analysis.
        Error     : out STD_LOGIC;  --! Indicates if a failure ever occurred.
        Status    : out STD_LOGIC;  --! Present state of failure.
        StuckHigh : out STD_LOGIC;  --! Pulse is stuck high.
        StuckLow  : out STD_LOGIC;  --! Pulse is stuck low.
        BadRise   : out STD_LOGIC   --! Rising edge outside acceptable window.
    );
end Watchdog;

architecture Behaviour of Watchdog is

    --! Instantiations of sub-components.
    component CustomCounter
        generic (
            --! These are the same as in the CustomCounter code,
            --! but must be adjusted in the mapping, not here.
            Count        : integer   := 5;    --! Count
            EdgeType     : std_logic := '1';  --! Specifies type of timeout
            CounterDrive : std_logic := '1';  --! Specifies driving edge for counter clock
            OutputHL     : std_logic := '1';  --! Specifies active High/Low output
            TCntDur      : integer   := 0     --! Duration of TCnt in clock cycles
        );
        --! Ports
        port (
            --! Inputs
            Clock : in std_logic;   --! Clock
            Reset : in std_logic;   --! Reset
            --! Outputs
            TCnt  : out std_logic   --! Terminal Count
        );
    end component;

    --! Signal Declarations
    --! Initialize to some known value when/if possible to avoid unknown logic states.
    signal D             : std_logic := '0';
    signal Q             : std_logic := '0';
    signal NoFallingEdge : std_logic := '0';
    signal NoRisingEdge  : std_logic := '0';
    signal BadRisingEdge : std_logic := '0';
    signal StatusSig     : std_logic := '0';

--! Begin Behaviour
begin

    --! @Instantiations

    --! @details Times out without falling edge after specified interval.
    --! Detects a pulse signal stuck high.
    FallingEdgeTimeout : CustomCounter
        generic map (
            Count        => 5,    --! Times out after 5 clock cycles
            EdgeType     => '0',  --! Falling edge resets counter
            CounterDrive => '1',  --! Counts on rising edge
            OutputHL     => '0',  --! Active-high output
            TCntDur      => 5     --! With TCntDur = Count, timeout stays asserted until the heartbeat resets the counter
        )
        port map (
            Clock => Clock,
            Reset => HeartBeat,
            TCnt  => NoFallingEdge
        );

    --! @details Times out without rising edge after specified interval.
    --! Detects a pulse signal stuck low.
    RisingEdgeTimeout : CustomCounter
        generic map (
            Count        => 5,    --! Times out after 5 clock cycles
            EdgeType     => '1',  --! Rising edge resets counter
            CounterDrive => '1',  --! Counts on rising edge
            OutputHL     => '0',  --! Active-high output
            TCntDur      => 5     --! With TCntDur = Count, timeout stays asserted until the heartbeat resets the counter
        )
        port map (
            Clock => Clock,
            Reset => HeartBeat,
            TCnt  => NoRisingEdge
        );

    --! @details Creates a window for an acceptable rising edge clock event.
    --! Detects a sudden incorrect pulse rise.
    RisingEdgeWindow : CustomCounter
        generic map (
            Count        => 4,    --! Times out after 4 clock cycles
            EdgeType     => '1',  --! Rising edge resets counter
            CounterDrive => '0',  --! Counts on falling edge
            OutputHL     => '1',  --! Active-low output
            TCntDur      => 0     --! Timeout asserted for 1 clock cycle
        )
        port map (
            Clock => Clock,
            Reset => HeartBeat,
            TCnt  => D
        );

    --! @End Instantiations

    --! @Processes

    --! @details Implements a D flip-flop to ensure the rising edge of the microcontroller pulse is within the
    --! window established by the RisingEdgeWindow instantiation. The active-low output of the RisingEdgeWindow
    --! is fed to the data input of the D flip-flop, while the pulse signal is treated as the flip-flop's clock
    --! signal. The rising edge of the pulse will cause the data input to appear on Q, the flip-flop's output.
    --! Since the window created by RisingEdgeWindow is active-low, any rising edge pulse signal that occurs
    --! outside this duration of time will cause the normally-high output of RisingEdgeWindow to appear on Q
    --! and indicate an error.
    --! Note that to reduce false triggers rising_edge(signal) is used instead of (signal'event and signal = '1').
    --! The difference is that rising_edge() ensures the previous logic state was '0', whereas clk'event detects
    --! a change from any state, including Z, U, X, W, L, H.
    DFF : process (HeartBeat)
    begin
        if rising_edge(HeartBeat) then  --! Check for "Enable" of Flip-Flop
            BadRisingEdge <= D;
        end if;                         --! Q unchanged without enable and clock
    end process;

    --! @details Latches onto errors permanently, or ignores them if being initialized.
    -- Initialize is included in the sensitivity list so that simulation responds to it.
    DFF2 : process (StatusSig, Initialize)
    begin
        if (Initialize = '1') then
            Q <= '0';
        elsif (StatusSig = '1') then    --! Check for "Enable" of Flip-Flop
            Q <= '1';                   -- line lost in the source listing; inferred from the latching behaviour described
        end if;                         --! Q unchanged without enable and clock
    end process;

    --! @End Processes

    --! @details Error signal is triggered from any of the error sources.
    StuckHigh <= NoFallingEdge;
    StuckLow  <= NoRisingEdge;
    BadRise   <= BadRisingEdge;
    StatusSig <= NoRisingEdge or NoFallingEdge or BadRisingEdge;
    Status    <= StatusSig;
    Error     <= Q;

-- End of Architecture Body
end Behaviour;


III. Voting Logic

In addition to a watchdog for detecting errors, a layer of voting logic is necessary for a DMR system to function. 
The first step is to monitor both microcontrollers with the watchdog. Once an error is detected, the specifications 
and designer decide what to do with that information. Standard practice with a simple digital design may be to use a 
Karnaugh Map (K-Map) for finding a simplified realization. With just a few added I/O options the K-map can 
become tedious, especially if the system has greater complexity than dual redundancy. The underlying voting logic, 
which is to select a healthy active component, is very straightforward when requirements are kept to a minimum. A 
simple schematic capture of an early version of the DMR voting logic used here is pictured in Fig. 4.⁴



Figure 4. Schematic capture view of simple voting logic. The voting logic here also accounts for an initial
microcontroller selection and an override.

VHDL again can provide a simpler, more easily modified solution (without studying a mass of logic gates each
time) that could transfer readily to FPGAs from different vendors. Some example code is given below.

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.STD_LOGIC_UNSIGNED.ALL;
use IEEE.NUMERIC_STD.ALL;

--! @details
--! Switches from an initially-selected processor to an alternate one when a failure
--! of the initial processor is indicated. Overrides are possible for extreme cases
--! where both processors fail.
entity ProcessorSelect is
    generic (
        --! Choose the initially active processor.
        --! '0' is Processor 1 [default]
        --! '1' is Processor 2
        InitProcessor : std_logic := '0'    --! Initial Processor Select
    );
    port (
        -- Inputs
        --! Indicates whether the failure status of a processor has ever been triggered.
        MCU1Fail : in std_logic;            --! MCU1 Failure Indicator
        MCU2Fail : in std_logic;            --! MCU2 Failure Indicator

        --! The status indicator monitors the original failure trigger. The MCU1Fail and MCU2Fail
        --! are permanently latched from these signals. Ordinarily, once a processor has an indicated
        --! failure it should never be active again. In the event that both processors fail, however,
        --! the original status trigger can be monitored to select the most viable alternative.
        MCU1Status : in std_logic;          --! MCU1 current status
        MCU2Status : in std_logic;          --! MCU2 current status

        --! Enables manual override; must be disabled by default!
        --! '0' is disable
        --! '1' is enable
        ForceEnable : in std_logic;         --! Override MCU Selection

        --! Processor to select once manual override enabled. Should not be unknown!
        --! '0' is Processor 1
        --! '1' is Processor 2
        ForceSelect : in std_logic;         --! Override select for MCU2

        -- Outputs
        --! The output selects and indicates the appropriate processor. Only one at a time may ever be active.
        --! '0' is Processor 1.
        --! '1' is Processor 2.
        SelectedProcessor : out std_logic   --! Selects appropriate processor.
    );
end ProcessorSelect;

architecture Behaviour of ProcessorSelect is

    -- Signal Declarations
    signal ActiveProcessor : std_logic := InitProcessor;    --! Active Processor

begin

    -- Processes

    --! Selects alternate processor in the event of failure; decides what to do when both fail.
    -- All inputs that are read appear in the sensitivity list so the selection tracks every change.
    -- The both-failed case is tested first; placed after the single-failure tests it could never be reached.
    Selection : process (ForceEnable, ForceSelect, MCU1Fail, MCU2Fail, MCU1Status, MCU2Status)
    begin
        if (ForceEnable = '1') then                         --! Manual override
            ActiveProcessor <= ForceSelect;                 --! Forced processor selection
        else                                                --! No override: normal operation
            if (MCU1Fail = '1' and MCU2Fail = '1') then     --! Both processors have failed
                if (MCU1Status = '1' and MCU2Status = '0') then     --! Processor 2 watchdog currently good
                    ActiveProcessor <= '1';                 --! Select processor 2
                elsif (MCU1Status = '0' and MCU2Status = '1') then  --! Processor 1 watchdog currently good
                    ActiveProcessor <= '0';                 --! Select processor 1
                else
                    ActiveProcessor <= InitProcessor;       --! Ensure a processor is always selected
                end if;
            elsif (MCU1Fail = '1') then
                ActiveProcessor <= '1';                     --! Choose Processor 2 when 1 fails
            elsif (MCU2Fail = '1') then
                ActiveProcessor <= '0';                     --! Choose Processor 1 when 2 fails
            else
                ActiveProcessor <= InitProcessor;           --! No failures: continue to use initial.
            end if;
        end if;
    end process;

    SelectedProcessor <= ActiveProcessor;

-- End Architecture
end Behaviour;

Many other voting logic algorithms are possible, depending on the application. Writing the logic in VHDL will
help simplify the revision process if needed. For instance, if both processors fail, a linear feedback shift register
(LFSR) might be used to pseudo-randomly choose one rather than the initial default. That design may be more
useful for applications requiring more layers of redundancy. Figure 5⁴ shows the overall DMR microcontroller
handling (watchdog plus voting logic) block diagram.
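
A minimal sketch of the LFSR idea is shown below, assuming a 4-bit register and the feedback polynomial x^4 + x^3 + 1. The entity name `SelectLfsr`, its ports, the register width, and the seed are all assumptions for illustration; this is not part of the design presented in this report.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

-- Hypothetical 4-bit LFSR producing a pseudo-random selection bit.
entity SelectLfsr is
    port (
        Clock : in  std_logic;  -- free-running system clock
        Reset : in  std_logic;  -- asynchronous, active-high; reloads the seed
        Pick  : out std_logic   -- '0' = Processor 1, '1' = Processor 2
    );
end SelectLfsr;

architecture Behaviour of SelectLfsr is
    -- The seed must be non-zero: an all-zero state would never change.
    constant Seed : std_logic_vector(3 downto 0) := "1001";
    signal State  : std_logic_vector(3 downto 0) := Seed;
begin
    Shifting : process (Reset, Clock)
    begin
        if Reset = '1' then
            State <= Seed;
        elsif rising_edge(Clock) then
            -- Shift left; the new low bit is bit 3 XOR bit 2 (taps for x^4 + x^3 + 1).
            State <= State(2 downto 0) & (State(3) xor State(2));
        end if;
    end process;

    -- The top bit cycles through a 15-state pseudo-random sequence.
    Pick <= State(3);
end Behaviour;
```

In a sketch like this, `Pick` would be sampled only at the moment both processors are found to have failed. An explicit reset is included because signal initial values are not honored by every FPGA family.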



Figure 5. Overall block diagram of the watchdog with voting logic. The watchdogs monitor the microcontrollers
and the voting logic tries to keep the system functioning with minimal interruption in the event of a failure. Note the
Power-On Reset (POR) circuit. This initializes the system to a known state when first powered and is
vendor-specific.


IV. Considerations

The POR circuit seen in Fig. 5 is necessary to ensure the system starts up properly and without any false failures
while the system clock and heartbeat signals stabilize. The one pictured is vendor-specific to Microsemi/Actel
FPGAs and was found in an application note. One benefit of the circuit is the ability to modify how long the
initialization lasts. Similar PORs may be available for other vendors, and they are worth considering to ensure that all
logic starts at a known state.

The schematic capture and VHDL code found herein are provided for example only. Actual implementation is
specific to a particular application and must take the appropriate requirements into account. The intent was to
convey a simple watchdog capable of monitoring both clock edges for identifying a variety of possible failures very
quickly.

Capabilities of FPGAs may vary, and each vendor offers its own set of Intellectual Property Cores (IP Cores) for
use in designing. These cores may be extremely useful but are not open source or available to use on other FPGAs.
This may limit the reuse of a particular design, no matter how much was written in standard VHDL, without access
to comparable IP Cores from other vendors.

V. Conclusion

Redundancy can provide a dramatic increase in reliability. Reliability is not to be compromised where safety or
high cost is concerned. Voting logic and watchdogs are necessary for many redundant systems. The use of FPGAs
allows this functionality to be added with a small footprint, or readily added to designs already incorporating FPGAs.
Using VHDL for FPGA design allows for more modular use of the design, as it is not vendor-specific. Maintaining
well-managed and well-commented VHDL designs will allow for reuse of code in many future designs.